Language and Speech Processing
Abstract
At the end of the 19th century, L. L. Zamenhof proposed Esperanto, which was intended as a global language to be spoken and understood by everyone. Its inventor hoped that a common language could resolve the global problems that lead to conflict. This idealistic idea did not reach its full potential, yet there are still scientific fields pursuing its legacy, though not necessarily from an ideological point of view. The Internet contains billions of web pages; because they are written in many different languages, a great deal of information is not accessible to us. A practical application would be a browser that translates these pages into a preferred language. In the field of statistical machine translation (SMT), we try to build algorithms that translate from one language to another using only statistics taken from large bi-text corpora. When we adopt the SMT approach, we represent each individual word or group of words (a cept) as being connected to zero, one, or many foreign cepts, each connection carrying a probability value. This means that both the alignments and the probabilities need to be extracted from the bi-text. The main problem here is that we need the alignments to estimate the probabilities and the probabilities to estimate the alignments. Problems of this kind can be solved with the EM algorithm; one approach [2, 3] is particularly favored because it estimates reasonable models. In this paper, we present the SMT Aligner (SMTA), which word-aligns French-English sentences, given word translation probabilities provided by [1], to estimate good alignments under the assumptions of IBM Model 1. We then compare results from alignments that use the null word with ones that do not. We also introduce a heuristic that increases the F1-score by 4 percent. In Section 2 we describe the theory that forms the basis of the SMTA. Section 3 describes the research methodology. In Section 4 we present our results, and we conclude with a discussion in Section 5.
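The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the SMTA implementation; it only assumes the standard property of IBM Model 1 that alignment positions are uniformly likely, so the most probable alignment links each French word independently to the English word (or the null word) with the highest translation probability. The function names `align_model1` and `f1_score`, the `NULL` token, the `floor` value for unseen pairs, and the toy probability table are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

NULL = "<NULL>"  # hypothetical name for the null word

def align_model1(
    french: List[str],
    english: List[str],
    t: Dict[Tuple[str, str], float],
    use_null: bool = True,
    floor: float = 1e-12,
) -> List[Tuple[int, Optional[int]]]:
    """Return (french_index, english_index) links; english_index is None for NULL.

    t maps (french_word, english_word) to a translation probability t(f|e).
    Pairs missing from t get the small probability `floor` (an assumption).
    """
    candidates = ([NULL] if use_null else []) + list(english)
    offset = 1 if use_null else 0
    links: List[Tuple[int, Optional[int]]] = []
    for j, f in enumerate(french):
        best_k, best_p = None, -1.0
        for k, e in enumerate(candidates):
            p = t.get((f, e), floor)
            if p > best_p:  # ties keep the earliest candidate
                best_k, best_p = k, p
        if best_k is None or best_k - offset < 0:
            links.append((j, None))  # aligned to NULL, or no candidate at all
        else:
            links.append((j, best_k - offset))
    return links

def f1_score(predicted: Set[Tuple[int, int]], gold: Set[Tuple[int, int]]) -> float:
    """Harmonic mean of precision and recall over alignment links."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy probabilities, invented for illustration only.
toy_t = {
    ("la", "the"): 0.7, ("la", "house"): 0.05, ("la", NULL): 0.1,
    ("maison", "house"): 0.8, ("maison", "the"): 0.1, ("maison", NULL): 0.05,
    ("de", NULL): 0.4,
}
links = align_model1(["la", "maison"], ["the", "house"], toy_t)
```

Setting `use_null=False` forces every French word onto some English word, which is one simple way to compare alignments with and without the null word.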
Similar resources
Teaching approaches to Computer Assisted Language Learning
Computers have been used for language teaching ever since the 1960s. Learning a second language is a challenging endeavor, and, for decades now, proponents of computer assisted language learning (CALL) have declared that help is on the horizon. We investigate the suitability of deploying speech technology in computer-based systems that can be used to teach foreign language skills. In this case,...
Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity
Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...
Using functional magnetic resonance imaging (fMRI) to explore brain function: cortical representations of language critical areas
Pre-operative determination of the dominant hemisphere for speech, and of speech-associated sensory and motor regions, has been of great interest to neurological surgeons. This determination has been of utmost importance but difficult to achieve, requiring either invasive (Wada test) or non-invasive (brain mapping) methods. In the present study we have employed functional Magnetic Resonance Imaging...
Music Training Program: A Method Based on Language Development and Principles of Neuroscience to Optimize Speech and Language Skills in Hearing-Impaired Children
Introduction: In recent years, music has been employed in many intervention and rehabilitation programs to enhance cognitive abilities in patients. Numerous studies show that music therapy can help improve language skills in patients, including the hearing impaired. In this study, a new method of music training is introduced based on principles of neuroscience and capabilities of Persian languag...
Rehabilitation Approaches for Drug Abuse, Addiction and Pediatric Issues
The current issue of the Iranian Rehabilitation Journal contains original research evaluating the efficacy of addiction rehabilitation, an evaluation of a child rehabilitation system for community-based research, a reading program for children with Down syndrome, auditory stream segregation in auditory processing disorder, speech and language disorders, quality of life of adolescents with hearing ...